Extremal Segments in Random Sequences
We investigate the probability for the largest segment with total displacement Q in an N-step
random walk to have length L. Using analytical, exact enumeration, and Monte Carlo methods,
we reveal the complex structure of the probability distribution in the large N limit. In particular,
the size of the longest loop has a distribution with a square-root singularity at $\ell = L/N = 1$, an
essential singularity at $\ell = 0$, and a discontinuous derivative at $\ell = 1/2$.



Investigation of the ground states of randomly charged
polymers [1] suggests that, in order to take maximal
advantage of condensation energy and to diminish the
effects of long-range repulsion of the excess charges, the
polymer will select a necklace-like configuration, consisting
of a few large, almost neutral globules connected by
narrow chains. In general this presents a complicated
energy minimization problem. Some aspects of the solution
can be determined by asking a simpler question: What is
the length of the longest segment of the random sequence
(RS) of charges that has total charge Q? Alternatively,
one can think of a one-dimensional random walk (RW) in
which the longest segment with an end-to-end distance
Q is to be found. The problem resembles certain classical
RW problems, such as the problem of first and last
arrival at a given point, or the special case of the last
return to the starting point of the RW. However, the search
for the longest segment of the RW, among all possible
starting points, creates a more complicated problem. We
combine Monte Carlo (MC) and exact enumeration studies
with some exact analytical results in certain simple
limits to demonstrate some remarkable properties of the
distribution of the maximal-length segments.



A RS is described by a sequence of random charges
$\{q_i\}$ ($i = 1, \ldots, N$), where $q_i = \pm 1$ with equal
probability. Fig. 1 depicts an example of the accumulated charge
$S_i = \sum_{j=1}^{i} q_j$ for a RS ($S_0 = 0$). Every segment of
the sequence between, say, steps i and j, has a certain
charge $Q = S_j - S_i$. Such a segment will be denoted
as a Q-segment. (A 0-segment corresponds to a loop
in a RW, i.e., a segment for which the positions at the
beginning and the end coincide.) Consider the set of all
Q-segments for a fixed value of Q. Our task is to find
the segments of largest length L among these. Fig. 1 shows
the longest 0-segments and the longest 4-segments in
a RS with N = 24. Clearly, the longest Q-segment
does not have to be unique. We are interested in the
probability $P_N(L, Q)$ that L is the length of the longest
Q-segment in a RS of length N. For $|Q| > 0$, the set
of Q-segments for a given sequence may be empty, and
therefore $\sum_{L=0}^{N} P_N(L, |Q| > 0) < 1$.
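The definition above translates directly into a short computation. The following is a minimal sketch in Python, assuming the charges are given as a list of +1 and -1 values. The function name `longest_q_segment` and the first/last-occurrence strategy are illustrative choices made here, not something taken from the paper.

```python
from itertools import accumulate


def longest_q_segment(charges, Q):
    """Length of the longest segment with total charge Q, or None if none exists.

    A segment between steps i and j (i <= j) has charge S[j] - S[i], where
    S[0] = 0 and S[i] = q_1 + ... + q_i.
    """
    S = [0] + list(accumulate(charges))
    first, last = {}, {}
    for i, s in enumerate(S):
        first.setdefault(s, i)  # earliest step at which the value s occurs
        last[s] = i             # latest step at which the value s occurs
    best = None
    for v, i in first.items():
        j = last.get(v + Q)
        # The longest segment starting at value v runs from the first
        # occurrence of v to the last occurrence of v + Q, if that is not earlier.
        if j is not None and j >= i and (best is None or j - i > best):
            best = j - i
    return best
```

With this convention a sequence without any loop, such as three equal charges, has a longest 0-segment of length 0, while a Q-segment with nonzero Q that never occurs yields `None`.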



FIG. 1. Example of a RS. In this case, the longest 0-segments
have length L = 18 (dotted lines), while the longest
4-segments (dot-dashed lines) have length L = 22.

Most properties of RSs have simple continuum limits.
For example, the probability that an N-element RS has
total charge Q (for even N + Q) is

$$W_N(Q) = 2^{-N} \binom{N}{(N+Q)/2} \approx \sqrt{\frac{2}{\pi N}}\, \exp\!\left(-\frac{Q^2}{2N}\right). \qquad (1)$$

Similarly, we expect $P_N(L, Q)$ to approach a simple
form when $N, L, Q \to \infty$, while the reduced length
$\ell = L/N$ and the reduced charge $q = Q/\sqrt{N}$ are kept
constant. In that (continuum) limit it is more convenient
to work with the probability density
$p(\ell, q) = \frac{N}{2}\,[P_N(L, Q) + P_N(L+1, Q)]$. (At most one of the two
probabilities is nonzero, since $P_N = 0$ for odd L + Q.) In
certain cases, $P_N$ can be calculated exactly, especially for
very small values of L and N - L, and for arbitrary Q. For
example, $P_N(L = N, Q) = W_N(Q)$, while
$P_N(L = N-2, Q) = \frac{1}{4}\,[W_{N-4}(Q+2) + 2W_{N-4}(Q) + W_{N-4}(Q-2)]$.
Similarly, one can find expressions for very small L ||.
However, we were unable to find a general expression for
arbitrary N, L, and Q. We performed exact enumeration
studies of $P_N(L, 0)$ for N < 36. Results for a few
values of N are shown in Fig. 2a. The results converge
extremely fast to the continuum distribution $p(\ell, 0)$. The
solid curve in the same figure depicts the results of a MC
evaluation of the probability density from 10 s randomly
selected sequences of length N = 1000.
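For small N the probabilities can be obtained by brute force, which gives a direct check of the two closed-form cases quoted above. The sketch below is not the authors' enumeration code: it assumes plain enumeration of all 2^N sequences is affordable, and the helper names `longest_by_charge`, `exact_P`, and `W` are introduced here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb


def longest_by_charge(charges):
    """Map each charge Q to the length of the longest Q-segment of the sequence."""
    S = [0]
    for q in charges:
        S.append(S[-1] + q)
    first, last = {}, {}
    for i, s in enumerate(S):
        first.setdefault(s, i)
        last[s] = i
    best = {}
    for v, i in first.items():
        for w, j in last.items():
            if j >= i and j - i > best.get(w - v, -1):
                best[w - v] = j - i
    return best


def exact_P(N):
    """Return {(L, Q): P_N(L, Q)} by enumerating all 2**N sequences."""
    counts = Counter()
    for charges in product((1, -1), repeat=N):
        for Q, L in longest_by_charge(charges).items():
            counts[(L, Q)] += 1
    return {key: Fraction(c, 2**N) for key, c in counts.items()}


def W(N, Q):
    """Exact probability that an N-element sequence has total charge Q."""
    if abs(Q) > N or (N + Q) % 2:
        return Fraction(0)
    return Fraction(comb(N, (N + Q) // 2), 2**N)


N = 10
P = exact_P(N)
for Q in range(-N, N + 1):
    # Longest segment spans the whole sequence.
    assert P.get((N, Q), 0) == W(N, Q)
    # Longest segment is two steps shorter than the sequence.
    assert P.get((N - 2, Q), 0) == (
        W(N - 4, Q + 2) + 2 * W(N - 4, Q) + W(N - 4, Q - 2)
    ) / 4
```

Exact rational arithmetic via `Fraction` is used so that the comparisons are equalities rather than floating-point approximations.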
FIG. 2. (a) Probability density of 0-segments as a function of reduced length $\ell$. Circles, squares, and triangles depict the
exact enumeration results for N = 8, 16, and 32, respectively. The solid line shows results of MC simulations (see text). (b)
Probability density of Q-segments as a function of reduced charge q and reduced length $\ell$. The results have been obtained
from MC simulations (see text).



The probability density $p(\ell, 0)$ shown in Fig. 2a has
several remarkable properties: (a) MC results show that
p at $\ell = 1/2$ is very close to unity (1.004 ± 0.006). At that
point the slope of the curve changes by an order of
magnitude. (b) For $\ell \to 0$, the function exhibits an essential
singularity of the form $\sim \ell^{-2} \exp(-B/\ell)$, where $B \approx 1.7$.
(c) For $\ell \to 1$, the function diverges as $(1-\ell)^{-1/2}$.
Qualitatively, this behaviour can be understood as follows:
The length of the longest 0-segment strongly depends on
the overall charge $Q_0$ of the chain. For simplicity, let
us assume that it depends only on $Q_0$. Then, for small
values of $Q_0$ we can relate $\ell = 1 - aQ_0^2/N$, where a is
of order unity. For very large $Q_0$ and, thus, for very
small $\ell$, the length of the longest 0-segment will be of
the order of a scale at which the random excursion of the
RW becomes comparable to the drift produced by $Q_0$,
i.e., when $L^{1/2} \approx LQ_0/N$, and therefore $\ell \approx N/Q_0^2$. By
applying the relation $p(\ell, 0) = (N/2)\,W_N(Q_0)\,|dQ_0/d\ell|$ in
both limits, we correctly reproduce the square-root
divergence for $\ell \to 1$, and the $\exp(-\mathrm{const}/\ell)$ singularity for
$\ell \to 0$. (The leading pre-exponential power is not
reproduced correctly in the latter case. A more involved
argument || also reproduces this power correctly.) It
is interesting to note that, by matching the asymptotic
form of $p(\ell, 0)$ near $\ell = 1$ with $P_N(L, 0)$ for L = N - 2,
we reproduce almost the exact value of the prefactor, i.e.,
the discrete distribution approaches its asymptotic
(continuum) form within a few steps of the extreme L = N.
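The two limits of this change-of-variables argument can be written out explicitly. The sketch below works only up to constant prefactors and assumes the Gaussian form $W_N(Q_0) \propto N^{-1/2}\exp(-Q_0^2/2N)$; the constant $c$ of order unity is introduced here for illustration and does not appear in the text.

```latex
% Limit \ell \to 1: Q_0 is small, so W_N(Q_0) is approximately constant.
\ell = 1 - \frac{a Q_0^2}{N}
\;\Rightarrow\;
Q_0 = \sqrt{\frac{N(1-\ell)}{a}}, \qquad
\left|\frac{dQ_0}{d\ell}\right| = \frac{1}{2}\sqrt{\frac{N}{a(1-\ell)}}
\;\Rightarrow\;
p(\ell,0) \propto (1-\ell)^{-1/2}.

% Limit \ell \to 0: Q_0 is large, so the Gaussian tail dominates.
\ell = \frac{cN}{Q_0^2}
\;\Rightarrow\;
W_N(Q_0) \propto \exp\!\left(-\frac{Q_0^2}{2N}\right) = \exp\!\left(-\frac{c}{2\ell}\right), \qquad
\left|\frac{dQ_0}{d\ell}\right| = \frac{\sqrt{cN}}{2}\,\ell^{-3/2}
\;\Rightarrow\;
p(\ell,0) \propto \ell^{-3/2}\exp\!\left(-\frac{c}{2\ell}\right).
```

The exponential factor reproduces the essential singularity, while the power $\ell^{-3/2}$ obtained this way differs from the $\ell^{-2}$ quoted in (b), in line with the remark that the simple argument misses the pre-exponential power.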

Fig. 2b depicts the full probability density $p(\ell, q)$,
obtained from a MC evaluation of $10^7$ sequences of length
N = 1024. This figure demonstrates further peculiarities
of $p(\ell, q)$: For fixed $\ell$, the q-dependence of p is
qualitatively different for $\ell > 1/2$ and $\ell < 1/2$. In the former
case, the distribution has a single peak at q = 0, and
the areas $A_\ell = \int dq\, p(\ell, q)$ under fixed-$\ell$ sections have
the form $\mathrm{const}/\sqrt{1-\ell}$. In the latter case, however, we
see two peaks, and $A_\ell$ is approximately linear in $\ell$ for
$0.15 < \ell < 0.5$.
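A Monte Carlo estimate of this kind can be sketched in a few lines. The code below is an illustration rather than the simulation used for Fig. 2b: the names `mc_counts` and `area_estimate`, the seed handling, and the small sample sizes are assumptions made here. The area estimate further assumes the normalization $p = (N/2) P_N$ together with a spacing $dq = 2/\sqrt{N}$ between allowed charges at fixed L.

```python
import random
from collections import Counter


def longest_by_charge(charges):
    """Map each charge Q to the length of the longest Q-segment of the sequence."""
    S = [0]
    for q in charges:
        S.append(S[-1] + q)
    first, last = {}, {}
    for i, s in enumerate(S):
        first.setdefault(s, i)
        last[s] = i
    best = {}
    for v, i in first.items():
        for w, j in last.items():
            if j >= i and j - i > best.get(w - v, -1):
                best[w - v] = j - i
    return best


def mc_counts(N, samples, seed=0):
    """Count, over random sequences, how often L is the longest Q-segment length."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(samples):
        charges = [rng.choice((1, -1)) for _ in range(N)]
        for Q, L in longest_by_charge(charges).items():
            counts[(L, Q)] += 1
    return counts


def area_estimate(counts, N, samples, L):
    """Estimate A_ell at ell = L/N as sqrt(N) * sum_Q P_N(L, Q).

    Assumes p = (N/2) * P_N and dq = 2/sqrt(N), as stated in the lead-in.
    """
    total = sum(c for (length, _), c in counts.items() if length == L)
    return N ** 0.5 * total / samples
```

Dividing `counts[(L, Q)]` by the number of samples gives an estimate of $P_N(L, Q)$; resolving the two-peak structure at small $\ell$ would need far more samples than a quick run provides.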

An interesting and potentially useful integral relation
exists between the probabilities $P_N(L, Q)$. The number
of sequences in which the longest Q-segment has length
L is $2^N P_N(L, Q)$. In the particular case of L > N/2,
we can construct all such sequences as follows: First
we construct all sequences of length 2(N - L) with
longest Q'-segment of length N - L, i.e., exactly half
of its total length. There are $2^{2(N-L)} P_{2(N-L)}(N-L, Q')$
such sequences. Next, we construct all sequences of
length 2L - N and total charge Q - Q'. There are
$2^{2L-N} W_{2L-N}(Q - Q')$ such sequences. Finally, by
inserting any chain from the second group into any chain
from the first group at its midpoint, and repeating this
process for all possible values of Q', we will reproduce all
the sequences of total length N with longest Q-segments
of length L. It is easy to verify that this process
produces every desired configuration once and only once.
The necessary condition, however, is that L > N/2, i.e.,
the longest Q-segment must include the midpoint of the
sequence. In the continuum limit, this relation can be
expressed as

$$p(\ell, q) = \frac{1}{2\sqrt{\pi (1-\ell)(2\ell-1)}} \int dq'\, p(1/2, q')\, \exp\!\left[-\frac{\left(q - q'\sqrt{2(1-\ell)}\right)^2}{2(2\ell-1)}\right], \qquad (2)$$

where the reduced variable $q' = Q'/\sqrt{2(N-L)}$, and the
Gaussian term in the integrand represents the continuum
limit of $W_{2L-N}(Q - Q')$. Equation (2) gives the probabilities
for any $\ell > 1/2$ in terms of their value at $\ell = 1/2$, and
reduces to an identity in the $\ell \to 1/2$ limit. By integrating
both sides of Eq. (2) over q, we find a relation between
the areas $A_\ell$ for $\ell > 1/2$:

$$A_\ell = \frac{A_{1/2}}{\sqrt{2(1-\ell)}}, \qquad (3)$$

which confirms the observation from the MC data that,
for $\ell > 1/2$, $A_\ell$ simply increases as $1/\sqrt{1-\ell}$. In the $\ell \to 1$
limit, the variable q' disappears from the exponent in
Eq. (2), and the relation reduces to

$$p(\ell \to 1, q) = \frac{A_{1/2}}{2\sqrt{\pi(1-\ell)}}\, e^{-q^2/2}. \qquad (4)$$

This relation both confirms our contention that $p(\ell, 0)$
has a square-root divergence $A/\sqrt{\pi(1-\ell)}$, with
$A = A_{1/2}/2$, and demonstrates that the fixed-$\ell$ sections of the
surface in Fig. 2b approach a pure Gaussian shape when
$\ell \to 1$. In addition to the MC study, we performed an
exact enumeration study to determine A for sequences
with N < 30, and found that it extrapolates to the value
1.011 ± 0.001, in perfect agreement with the MC result:
definitely larger than unity, but surprisingly close to it.
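Before taking the continuum limit, the counting argument for L > N/2 amounts to the discrete convolution $P_N(L, Q) = \sum_{Q'} P_{2(N-L)}(N-L, Q')\, W_{2L-N}(Q - Q')$, obtained by dividing the sequence counts by $2^N$. A minimal sketch that checks this identity by exhaustive enumeration for small N follows; the helper names (`longest_by_charge`, `exact_P`, `W`, `relation_holds`) are illustrative and the brute-force approach is an assumption of this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb


def longest_by_charge(charges):
    """Map each charge Q to the length of the longest Q-segment of the sequence."""
    S = [0]
    for q in charges:
        S.append(S[-1] + q)
    first, last = {}, {}
    for i, s in enumerate(S):
        first.setdefault(s, i)
        last[s] = i
    best = {}
    for v, i in first.items():
        for w, j in last.items():
            if j >= i and j - i > best.get(w - v, -1):
                best[w - v] = j - i
    return best


def exact_P(N):
    """Return {(L, Q): P_N(L, Q)} by enumerating all 2**N sequences."""
    counts = Counter()
    for charges in product((1, -1), repeat=N):
        for Q, L in longest_by_charge(charges).items():
            counts[(L, Q)] += 1
    return {key: Fraction(c, 2**N) for key, c in counts.items()}


def W(N, Q):
    """Exact probability that an N-element sequence has total charge Q."""
    if abs(Q) > N or (N + Q) % 2:
        return Fraction(0)
    return Fraction(comb(N, (N + Q) // 2), 2**N)


def relation_holds(N):
    """Check the convolution identity for every L > N/2 and every Q."""
    P_N = exact_P(N)
    for L in range(N // 2 + 1, N + 1):
        M = 2 * (N - L)  # length of the sequence whose longest segment is half of it
        P_M = exact_P(M)
        for Q in range(-N, N + 1):
            rhs = sum(
                P_M.get((N - L, Qp), 0) * W(2 * L - N, Q - Qp)
                for Qp in range(-M, M + 1)
            )
            if P_N.get((L, Q), 0) != rhs:
                return False
    return True
```

For L = N the first group reduces to the empty sequence, and the identity collapses to $P_N(N, Q) = W_N(Q)$.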

We did not find analogous integral relations for $\ell < 1/2$.
Here, the situation is complicated by the fact that, in a
given sequence, there may be several longest Q-segments
that are disjoint. The q-dependence of $p(\ell, q)$ for small
values of $\ell$ has a minimum at q = 0. The minimum
disappears as $\ell$ increases, at $\ell \approx 1/2$. Further analysis is
necessary to understand the behaviour of $p(\ell, q)$ in this
region.
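The complication for short segments is easy to see in a small example. The sketch below, with the illustrative names `longest_segments` and `has_disjoint_pair`, lists every longest Q-segment as a pair of step indices (i, j) and tests whether two of them fail to overlap; it uses a direct scan over window lengths, chosen here for clarity rather than speed.

```python
def longest_segments(charges, Q):
    """Return all longest Q-segments as (i, j) step pairs, or [] if none exist."""
    S = [0]
    for q in charges:
        S.append(S[-1] + q)
    n = len(charges)
    # Scan window lengths from longest to shortest and stop at the first hit.
    for length in range(n, 0, -1):
        hits = [(i, i + length) for i in range(n - length + 1)
                if S[i + length] - S[i] == Q]
        if hits:
            return hits
    return []


def has_disjoint_pair(segments):
    """True if some segment ends no later than another one begins."""
    return any(a[1] <= b[0] for a in segments for b in segments)
```

For the six-step sequence `[1, -1, 1, 1, 1, -1]` the longest loops have length 2, below N/2 = 3, and two of them are disjoint; once L exceeds N/2 every longest segment contains the midpoint, so no disjoint pair can occur.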

In conclusion, we demonstrated that the probability
density $p(\ell, q)$ has some peculiar and unexpected
properties and very rich behaviour, despite the apparent
simplicity of its formulation and its similarity to classical
RW problems. More analysis is needed to fully understand
various properties of the extremal segments in a
RS.